PepSOM: an algorithm for peptide identification by tandem mass spectrometry based on SOM.

نویسندگان

  • Kang Ning
  • Hoong Kee Ng
  • Hon Wai Leong
چکیده

Peptide identification by tandem mass spectrometry is both an important and challenging problem in proteomics. At present, huge amount of spectrum data are generated by high throughput mass spectrometers at a very fast pace, but algorithms to analyze these spectra are either too slow, not accurate enough, or only gives partial sequences or sequence tags. In this paper, we emphasize on the balance between identification completeness and efficiency with reasonable accuracy for peptide identification by tandem mass spectrum. Our method works by converting spectra to vectors in high-dimensional space, and subsequently use self-organizing map (SOM) and multi-point range query (MPRQ) algorithm as a coarse filter reduce the number of candidates to achieve efficient and accurate database search. Experiments show that our algorithm is both fast and accurate in peptide identification.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

An accurate and efficient algorithm for Peptide and ptm identification by tandem mass spectrometry.

Peptide identification by tandem mass spectrometry (MS/MS) is one of the most important problems in proteomics. Recent advances in high throughput MS/MS experiments result in huge amount of spectra. Unfortunately, identification of these spectra is relatively slow, and the accuracies of current algorithms are not high with the presence of noises and post-translational modifications (PTMs). In t...

متن کامل

Two-phase Filtering Strategy for Efficient Peptide Identification from Mass Spectrometry.

Peptide identification by tandem mass spectrometry (MS/MS) is one of the most important problems in proteomics. Recent advances in high throughput MS/MS experiments result in huge amount of spectra, and the peptide identification process should keep pace. In this paper, we strive to achieve high accuracy and efficiency for peptide identification with the presence of noise by a two-phase filteri...

متن کامل

An efficient algorithm for the blocked pattern matching problem

MOTIVATION Tandem mass spectrometry (MS) has become the method of choice for protein identification and quantification. In the era of big data biology, tandem mass spectra are often searched against huge protein databases generated from genomes or RNA-Seq data for peptide identification. However, most existing tools for MS-based peptide identification compare a tandem mass spectrum against all ...

متن کامل

A new algorithm for the evaluation of shotgun peptide sequencing in proteomics: support vector machine classification of peptide MS/MS spectra and SEQUEST scores.

Shotgun tandem mass spectrometry-based peptide sequencing using programs such as SEQUEST allows high-throughput identification of peptides, which in turn allows the identification of corresponding proteins. We have applied a machine learning algorithm, called the support vector machine, to discriminate between correctly and incorrectly identified peptides using SEQUEST output. Each peptide was ...

متن کامل

Exploiting the kernel trick to correlate fragment ions for peptide identification via tandem mass spectrometry

MOTIVATION The correlation among fragment ions in a tandem mass spectrum is crucial in reducing stochastic mismatches for peptide identification by database searching. Until now, an efficient scoring algorithm that considers the correlative information in a tunable and comprehensive manner has been lacking. RESULTS This paper provides a promising approach to utilizing the correlative informat...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Genome informatics. International Conference on Genome Informatics

دوره 17 2  شماره 

صفحات  -

تاریخ انتشار 2006